Introduction

Exploratory Spatial Data Analysis

Data: National Risk Index Census Tracts

Data Description: The National Risk Index Census Tracts data set was downloaded from FEMA. This includes data on many risk factors at the census tract level across the US. I am specifically looking at Carteret County, NC.

Source: FEMA

Size: 85154 features, 470 columns

Spatial Coverage: Carteret County, North Carolina (data set covers all of United States)

Resolution: Census tract level

Time Period: Updated 2025

This project uses FEMA’s National Risk Index data to explore the spatial relationship between social vulnerability and hurricane risk in Carteret County, North Carolina. The analyses investigate patterns of population exposure, vulnerability, and clustering of risk using spatial statistical methods like mean center, standard distance, Moran’s I, and Getis-Ord Gi*.

Goal:

Exploring spatial patterns of population exposure and vulnerability to natural hazards in Carteret County, North Carolina using spatial statistics.

To understand the relationship between social vulnerability and hurricane risk.

Background

Understanding spatial patterns of exposure to natural hazards is important in policy and public health. Vulnerability in social settings (including poverty and transportation accessibility) is proven to expound on natural disasters (Anderson et al., 2000).

Methods

Began by:

Filtering Carteret County using county FIPS code
Transforming data to a NC relevant projection
Converting polygon data to centroids

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(tidyr)
library(sf)

## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE

library(ggplot2)
library(dplyr)
library(spdep)

## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`

library(sfdep)
library(tmap)
library(arcpullr)

nri_path <- "~/Downloads/GEOG215/FinalProject/NRI_Shapefile_CensusTracts/NRI_Shapefile_CensusTracts.shp"

nri_tracts <- st_read(nri_path)

## Reading layer `NRI_Shapefile_CensusTracts' from data source 
##   `/Users/dalzouby/Downloads/GEOG215/FinalProject/NRI_Shapefile_CensusTracts/NRI_Shapefile_CensusTracts.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 85154 features and 468 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -19942580 ymin: -1617132 xmax: 20012840 ymax: 11526950
## Projected CRS: WGS 84 / Pseudo-Mercator

carteret_tracts <- nri_tracts |>
  filter(STATE == "North Carolina", COUNTY == "Carteret")

carteret_tracts <- st_transform(carteret_tracts, 32119)
st_crs(carteret_tracts)

## Coordinate Reference System:
##   User input: EPSG:32119 
##   wkt:
## PROJCRS["NAD83 / North Carolina",
##     BASEGEOGCRS["NAD83",
##         DATUM["North American Datum 1983",
##             ELLIPSOID["GRS 1980",6378137,298.257222101,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4269]],
##     CONVERSION["SPCS83 North Carolina zone (meters)",
##         METHOD["Lambert Conic Conformal (2SP)",
##             ID["EPSG",9802]],
##         PARAMETER["Latitude of false origin",33.75,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8821]],
##         PARAMETER["Longitude of false origin",-79,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8822]],
##         PARAMETER["Latitude of 1st standard parallel",36.1666666666667,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8823]],
##         PARAMETER["Latitude of 2nd standard parallel",34.3333333333333,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8824]],
##         PARAMETER["Easting at false origin",609601.22,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8826]],
##         PARAMETER["Northing at false origin",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8827]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["United States (USA) - North Carolina - counties of Alamance; Alexander; Alleghany; Anson; Ashe; Avery; Beaufort; Bertie; Bladen; Brunswick; Buncombe; Burke; Cabarrus; Caldwell; Camden; Carteret; Caswell; Catawba; Chatham; Cherokee; Chowan; Clay; Cleveland; Columbus; Craven; Cumberland; Currituck; Dare; Davidson; Davie; Duplin; Durham; Edgecombe; Forsyth; Franklin; Gaston; Gates; Graham; Granville; Greene; Guilford; Halifax; Harnett; Haywood; Henderson; Hertford; Hoke; Hyde; Iredell; Jackson; Johnston; Jones; Lee; Lenoir; Lincoln; Macon; Madison; Martin; McDowell; Mecklenburg; Mitchell; Montgomery; Moore; Nash; New Hanover; Northampton; Onslow; Orange; Pamlico; Pasquotank; Pender; Perquimans; Person; Pitt; Polk; Randolph; Richmond; Robeson; Rockingham; Rowan; Rutherford; Sampson; Scotland; Stanly; Stokes; Surry; Swain; Transylvania; Tyrrell; Union; Vance; Wake; Warren; Washington; Watauga; Wayne; Wilkes; Wilson; Yadkin; Yancey."],
##         BBOX[33.83,-84.33,36.59,-75.38]],
##     ID["EPSG",32119]]

# converting tract polygons to centroids
tract_centroids <- st_centroid(carteret_tracts)

## Warning: st_centroid assumes attributes are constant over geometries

# reprojecting to EPSG: 32119
tract_centroids_proj <- st_transform(tract_centroids, 32119)

Variables:

Hurricane exposure population (HRCN_EXPP) this directly measures the population exposed to hurricane hazards.
Hurricane hazard type risk index score (HRCN_RISKS) provides a composite risk score for hurricanes to help identify areas with the highest risk.
Hurricane annualized frequency (HRCN_AFREQ) indicates the expected frequency of hurricanes per year.
Social vulnerability score (SOVI_SCORE) measures social vulnerability.

Results

Descriptive Statistics

summary(carteret_tracts$HRCN_EXPP)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    1277    2048    2181    2723    5068

tm_shape(carteret_tracts) + 
  tm_polygons("HRCN_EXPP", 
              fill.legend = tm_legend(title="Hurricane Exposure Population")) +
  tm_title("Hurricane Exposure Population by Census Tract")

summary(carteret_tracts$HRCN_RISKS)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   60.45   96.57   99.12   96.94   99.74   99.94

tm_shape(carteret_tracts) + 
  tm_polygons("HRCN_RISKS", fill.legend = tm_legend(title = "Hurricane Risk Index Score")) +
  tm_title("Hurricane Risk Index Score by Census Tract")

summary(carteret_tracts$HRCN_AFREQ)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.3870  0.4675  0.4675  0.7584  0.8441  3.0792

tm_shape(carteret_tracts) + 
  tm_polygons("HRCN_AFREQ", fill.legend = tm_legend(title ="Hurricane Annualized Frequency")) +
  tm_title("Hurricane Annualized Frequency by Census Tract")

## [plot mode] fit legend/component: Some legend items or map compoments do not
## fit well, and are therefore rescaled.
## ℹ Set the tmap option `component.autoscale = FALSE` to disable rescaling.

summary(carteret_tracts$SOVI_SCORE)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3.49   17.75   35.57   39.19   59.15   85.08

tm_shape(carteret_tracts) + 
  tm_polygons("SOVI_SCORE", fill.legend = tm_legend(title ="Social Vulnerability Score")) +
  tm_title("Social Vulnerability Score by Census Tract")

Initial Maps

# choropleth map of SOVI_SCORE, social vulnerability score
tm_shape(carteret_tracts) +
  tm_fill("SOVI_SCORE", fill.scale = tm_scale(values = "brewer.reds")) +
  tm_borders() +
  tm_title("Social Vulnerability Index")

# choropleth map of HRCN_RISKS, hurricane risk score
tm_shape(carteret_tracts) +
  tm_fill("HRCN_RISKS", fill.scale = tm_scale(values = "brewer.blues")) +
  tm_borders() +
  tm_title("Hurricane Risk Score")

# most of the county is at high risk for hurricanes. interestingly, part of the disconnected portion of land is at a lower risk.

Spatial Statistics

# creating a mean center function
meanCenter <- function(layer, weights=NULL) {
  # converting the data into points
  points <- st_centroid(layer)
  
  # if weights is null, then create weights of 1
  if (is.null(weights)) {
    weights <- rep(c(1), times = nrow(layer))
  }
  
  # calculating the x and y coordinates
  mc_x <- weighted.mean(st_coordinates(points$geometry)[,1], weights)
  mc_y <- weighted.mean(st_coordinates(points$geometry)[,2], weights)
  
  # create a point
  mean_center <- st_sfc(st_point(c(mc_x, mc_y)))
  # add a projection
  mean_center <- st_set_crs(mean_center, st_crs(layer))
  # return the object
  mean_center
}

stdCircle <- function(layer, weights=NULL) {
  points <- st_centroid(layer)
  
  # if weights is null, then create weights of 1
  if (is.null(weights)) {
    weights <- rep(c(1), times=nrow(layer))
  }
  
  # calculate th x and y coordinates
  mc_x <- weighted.mean(st_coordinates(points$geometry) [,1], weights)
  mc_y <- weighted.mean(st_coordinates(points$geometry) [,2], weights)
  mean_center <- meanCenter(layer, weights)
  
  # calculate the standard distance
  sd <- sqrt((sum(weights *
                    (st_coordinates(points$geometry)[,1] - mc_x)^2)
              / sum(weights)) +
               (sum(weights *
                      (st_coordinates(points$geometry)[,2] - mc_y)^2)
                / sum(weights)))
  
  # buffer the mean center with the standard distance
  std_circle <- st_buffer(mean_center, sd)
  
  # return the std_circle
  std_circle
}
# mean center for HRCN_EXPP

mc_expp <- meanCenter(carteret_tracts, carteret_tracts$HRCN_EXPP)

## Warning: st_centroid assumes attributes are constant over geometries

# standard circle for HRCN_EXPP
std_circle_expp <- stdCircle(carteret_tracts, carteret_tracts$HRCN_EXPP)

## Warning: st_centroid assumes attributes are constant over geometries
## Warning: st_centroid assumes attributes are constant over geometries

# mapping mean center and standard circle for hurricane exposure population

tm_shape(carteret_tracts) +
  tm_polygons("HRCN_EXPP", fill.scale = tm_scale(values = "brewer.purples"), fill.legend = tm_legend(title ="Hurricane Exposure Population")) +
  tm_shape(mc_expp) +
  tm_dots(col="black", size = 0.5, fill.legend = tm_legend(title = "Mean Center")) +
  tm_shape(std_circle_expp) +
  tm_borders(col="red", lwd = 2, fill.legend = tm_legend(title = "Standard Circle")) +
  tm_title("Mean Center and Standard Circle: Hurricane Exposure Population")

## [plot mode] fit legend/component: Some legend items or map compoments do not
## fit well, and are therefore rescaled.
## ℹ Set the tmap option `component.autoscale = FALSE` to disable rescaling.

# calculating mean center and standard circle for SOVI_SCORE

# mean center
mc_sovi <- meanCenter(carteret_tracts, carteret_tracts$SOVI_SCORE)

## Warning: st_centroid assumes attributes are constant over geometries

# standard circle
std_circle_sovi <- stdCircle(carteret_tracts, carteret_tracts$SOVI_SCORE)

## Warning: st_centroid assumes attributes are constant over geometries
## Warning: st_centroid assumes attributes are constant over geometries

# mapping mean center and standard circle for social vulnerability score

tm_shape(carteret_tracts) +
  tm_polygons("SOVI_SCORE", fill.scale = tm_scale(values = "brewer.blues"), fill.legend = tm_legend(title ="Social Vulnerability Score")) +
  tm_shape(mc_sovi) +
  tm_dots(col="black", size = 0.5, fill.legend = tm_legend(title = "Mean Center")) +
  tm_shape(std_circle_sovi) +
  tm_borders(col="red", lwd = 2, fill.legend = tm_legend(title = "Standard Circle")) +
  tm_title("Mean Center and Standard Circle: Social Vulnerability Score")

## [plot mode] fit legend/component: Some legend items or map compoments do not
## fit well, and are therefore rescaled.
## ℹ Set the tmap option `component.autoscale = FALSE` to disable rescaling.

Hot-Spot Analysis

Here hot-spot analysis is performed on HRCN_EXPP and SOVI_SCORE variables to identify areas with high exposure and high social vulnerability.

High positive Gi* values indicate significant hot spots or clusters of high population.

Low negative Gi* values indicate significant cold spots or clusters of low population.

Values near zero indicate no local clustering.

Z-scores above 1.96 or below negative 1.96 are statistically significant at the 95% confidence level.

# hot spot analysis for HRCN_EXPP hurricane exposure population

nb <- poly2nb(carteret_tracts, queen=TRUE, snap = 1e-5)

## Warning in poly2nb(carteret_tracts, queen = TRUE, snap = 1e-05): some observations have no neighbours;
## if this seems unexpected, try increasing the snap argument.

## Warning in poly2nb(carteret_tracts, queen = TRUE, snap = 1e-05): neighbour object has 4 sub-graphs;
## if this sub-graph count seems unexpected, try increasing the snap argument.

lw<- nb2listw(nb, style = "W", zero.policy = TRUE)

gi_star_expp <- localG(carteret_tracts$HRCN_EXPP, lw, zero.policy = TRUE)

carteret_tracts$gi_star_expp <- as.numeric(gi_star_expp)

tm_shape(carteret_tracts) +
  tm_polygons("gi_star_expp", fill.legend = tm_legend(title = "Getis-Ord Gi* Z-scores"),
              fill.scale = tm_scale_intervals(midpoint = NA, values = "brewer.reds")) + 
  tm_title("Hot Spot Analysis: Hurricane Exposure Population")

# hot spot analysis for SOVI_SCORE social vulnerability score

gi_star_sovi <- localG(carteret_tracts$SOVI_SCORE, lw, zero.policy = TRUE)

carteret_tracts$gi_star_sovi <- as.numeric(gi_star_sovi)

tm_shape(carteret_tracts) +
  tm_polygons("gi_star_sovi", fill.legend = tm_legend(title = "Getis-Ord Gi* Z-scores"), fill.scale = tm_scale_intervals(midpoint = NA, values = "brewer.yl_gn")) + 
  tm_title("Hot Spot Analysis: Social Vulnerability Score")

Calculating the correlation coefficient between HRCN_RISKS and SOVI_SCORE to understand how social vulnerability affects hurricane risk.

# calculating correlation coefficient
correlation <- cor(carteret_tracts$HRCN_RISKS,
                   carteret_tracts$SOVI_SCORE, use = "complete.obs")|>
  print()

## [1] 0.1006939

The resulting correlation coefficient of 0.1006939 means there is a weak but positive correlation between social vulnerability and hurricane hazard risk scores. This means that tracts with higher vulnerability tend to have a somewhat higher risk.

# morans I for HRCN_EXPP

moran_expp <- moran.test(carteret_tracts$HRCN_EXPP, lw, zero.policy = TRUE)|>
  print()

## 
##  Moran I test under randomisation
## 
## data:  carteret_tracts$HRCN_EXPP  
## weights: lw  
## n reduced by no-neighbour observations  
## 
## Moran I statistic standard deviate = 0.48855, p-value = 0.3126
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.03770659       -0.03571429        0.02258540

The p-value is greater than 0.05 meaning it is not statistically significant and there no significant spatial autocorrelation. This means that hurricane population exposure is not significantly clustered.

The lack of spatial autocorrelation for hurricane exposure implies a more dispersed risk pattern, suggesting that emergency planning cannot focus solely on clustered regions.

# morans I for SOVI_SCORE

moran_sovi <- moran.test(carteret_tracts$SOVI_SCORE, lw, zero.policy = TRUE)|>
  print()

## 
##  Moran I test under randomisation
## 
## data:  carteret_tracts$SOVI_SCORE  
## weights: lw  
## n reduced by no-neighbour observations  
## 
## Moran I statistic standard deviate = 2.1634, p-value = 0.01526
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.29515878       -0.03571429        0.02339172

The p-value here is less than 0.05 so there is significant spatial clustering for social vulnerability. This means that tracts with similar social vulnerability tend to be closer to one another.

# creating neighbors and weights
nb <- knn2nb(knearneigh(tract_centroids_proj, k = 5))
wt <- nbdists(nb, tract_centroids_proj)
inv_wt <- lapply(wt, function(x) 1/x)
lw <- nb2listw(nb, glist = inv_wt, style = "W")
# running local g on sovi_score

gi_zscores <- localG(tract_centroids_proj$SOVI_SCORE, lw)
tract_centroids_proj$gi <- as.numeric(gi_zscores)
tract_centroids_proj$p_val <- 2 * pnorm(-abs(tract_centroids_proj$gi))

# classifying the results
tract_centroids_proj <- tract_centroids_proj |>
  mutate(
    classification = case_when(
      gi > 0 & p_val <= 0.01 ~ "Very hot",
      gi > 0 & p_val <= 0.05 ~ "Hot",
      gi > 0 & p_val <= 0.1 ~ "Somewhat hot",
      gi < 0 & p_val <= 0.01 ~ "Very cold",
      gi < 0 & p_val <= 0.05 ~ "Cold",
      gi < 0 & p_val <= 0.1 ~ "Somewhat cold",
      TRUE ~ "Insignificant"
    ),
    classification = factor(
      classification,
      levels = c("Very hot", "Hot", "Somewhat hot", "Insignificant", "Somewhat cold", "Cold", "Very cold")
    )
  )

# mapping the results
tm_shape(carteret_tracts) +
  tm_polygons()+ 
  tm_shape(tract_centroids_proj) + 
  tm_symbols(fill = "classification", size = 0.5, 
             fill.scale = tm_scale_categorical(n=7, 
                                               values = "tableau.red_blue_diverging"), 
             fill.legend = tm_legend(title = "Hot Spot Classification" ))

Discussion

The analyses executed revealed patterns of vulnerability and exposure within Carteret County, NC. The hot spot analysis identified areas with both high hurricane exposure and social vulnerability, highlighting the most vulnerable populations.

Moran’s I confirmed that the patterns we identified are not random but meaningfully clustered.

To note: some limitations may include missing variables that influence vulnerability.

What worked well:

Spatial analysis: Moran’s I and hot spot analysis revealed spatial patterns
Visualizations: Maps clearly showed areas of risk

Future Improvements:

Adding more variables

Conclusion

We found that vulnerability is spatially clustered, but exposure is not. Vulnerability and risk have a weak positive correlation. The spatial analysis of FEMA’s National Risk Index highlighted areas where hurricane risk and social vulnerability overlap. This points out areas where policymakers can prioritize resources to prevent devastation during natural disasters. This spatial analysis of Carteret County reveals that while hurricane exposure does not show strong spatial clustering, social vulnerability does. The weak correlation between vulnerability and risk suggests that while higher vulnerability often coincides with higher risk, other factors can also play a role. The hot spot analysis highlights localized patterns that could guide policy responses targeted to the vulnerable populations.

References

AGU Publications - Wiley Online Library, agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2016GL070971. Accessed 25 April 2025.

FinalProject

Duaa Alzouby

2025-04-20